F-term Classification Experiments at NTCIR-6 for Justsystems
Abstract
We participated in the classification subtask of the NTCIR-6 Patent Retrieval Task using a system based on three document classifiers: a one-vs-rest SVM classifier, a multi-topic classifier, and a binary Naive Bayes classifier. The multi-topic classifier was constructed on the basis of the maximum margin principle and applied to multiple F-term classification. In the experiments, this multi-topic classifier yielded a higher F1 value than the one-vs-rest SVM in many cases. We also employed the one-vs-rest SVM classifier, which has certain drawbacks, such as low recall and long training time. To address these problems, we used two heuristics: random reduction of a part of the negative examples, and division of learning. These procedures reduce training time and improve classification performance when appropriate parameters are set.
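The random reduction of negative examples mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `subsample_negatives`, the `keep_ratio` parameter, and the toy label sets are assumptions made for illustration only. For one binary (one-vs-rest) problem, it keeps every positive example and a random fraction of the negatives.

```python
import random


def subsample_negatives(label_sets, target, keep_ratio, seed=0):
    """Pick training indices for one one-vs-rest problem (illustrative only).

    Keeps every document labeled with `target` and a random fraction
    `keep_ratio` of the documents that are not labeled with it.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    positives = [i for i, labels in enumerate(label_sets) if target in labels]
    negatives = [i for i, labels in enumerate(label_sets) if target not in labels]
    # Keep at least one negative whenever any negatives exist.
    n_keep = max(1, round(len(negatives) * keep_ratio)) if negatives else 0
    kept = rng.sample(negatives, n_keep)
    return sorted(positives + kept)


# Toy data: each document carries a set of hypothetical category labels.
docs = [{"A"}, {"B"}, {"A", "C"}, {"C"}, {"B"}, {"C"}, {"B", "C"}, {"B"}]
idx = subsample_negatives(docs, target="A", keep_ratio=0.5)
print(len(idx))  # 2 positives + 3 of the 6 negatives = 5
```

A smaller `keep_ratio` shrinks the training set for each binary classifier, which is the kind of parameter the abstract says must be set appropriately.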
Similar resources
SVM Based Learning System for F-term Patent Classification
This paper describes our SVM-based system and the techniques we used to adapt the approach for the specifics of the F-term patent classification subtask at NTCIR-6 Patent Retrieval Task. Our system obtained the best results according to two of the three measures used for performance evaluation. Moreover, the results from some additional experiments demonstrate that our system has benefited from...
Using the K-Nearest Neighbor Method and SMART Weighting in the Patent Document Categorization Subtask at NTCIR-6
Patent processing is important in industry, business, and law. We participated in the classification subtask (at NTCIR-6 Patent Retrieval Task), in which we classified patent documents into their F-terms using the k-nearest neighbor method. For document classification, F-term categories are both very precise and useful. We entered five systems in the classification subtask and obtained good res...
Overview of Classification Subtask at NTCIR-6 Patent Retrieval Task
This paper describes the Classification Subtask of the NTCIR-6 Patent Retrieval Task. The purpose of this subtask is to evaluate the methods of classifying patents into multi-dimensional classification structures called F-term (File Forming Term) classification systems. We report on how this subtask was designed, the test collection released, and the results of the evaluation.
Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task
In the present paper, a term weighting classification method using the chi-square statistic is proposed and evaluated in the classification subtask at the NTCIR-6 Patent Retrieval Task. In this task, large numbers of patent applications are classified into F-term categories. Therefore, a patent classification system requires high classification speed, as well as high classification accuracy. The chi...
Sentence Level Subjectivity and Sentiment Analysis Experiments in NTCIR-7 MOAT Challenge
This paper describes our supervised approach to the opinionated and the polarity subtasks in the NTCIR-7 MOAT Challenge. We apply a sequential tagging approach at the token level and use the learned token labels in the sentence level classification tasks. In our formal run submissions, we utilized SVM in both tasks with syntactic and lexicon-based features. Additionally, we present our experime...
Publication date: 2007